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LONG-TERM  GOALS 

The  Bayesian  Hierarchical  Model  (BHM)  methodology  is  exploited  to  identify,  characterize,  and 
model  the  irreducible  model  error  in  ocean  data  assimilation  and  forecast  systems. 

OBJECTIVES 

We  describe  4  objectives  addressed  in  the  fiscal  year  October  2012  -  September  2013. 

First,  we  seek  to  extend  the  proof-of-concept  results  comparing  a  BHM  surface  wind  ensemble 
with  the  increments  in  the  surface  momentum  flux  control  vector  in  a  four-dimensional 
variational  (4dvar)  assimilation  system.  The  current  objective  is  to  convert  BHM  surface  wind 
realizations  to  create  an  ensemble  of  surface  stress  vectors. 

Second,  continuing  the  effort  to  understand  irreducible  model  error  induced  by  representing  the 
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ocean  state  vector  on  a  discrete  grid,  the  current  objective  is  to  estimate  the  Hellinger  distance 
between  posterior  distributions  described  in  the  next  section. 

Third,  we  have  extended  the  hierarchical  models  for  stochastic  time- varying  error-covariance 
matrices  associated  with  data  assimilation  to  include  the  case  where  both  the  observation  and 
background  error  covariances  are  updated,  yet  dependent  upon  each  other. 

Fourth,  we  have  extended  the  emulator-assisted  data  assimilation  methodology  by  extending  the 
parameterization  of  the  spectral  quadratic  nonlinear  spatio-temporal  models  to  accommodate  the 
inclusion  of  nonlinear  interactions  from  small  scales  to  inform  the  evolution  of  large  scale  modes. 

APPROACH 

Converting  Surface  Wind  Realizations  to  a  Surface  Stress  Ensemble:  Sea-level  pressure  (SLP) 
and  surface  air  and  dew  point  temperature  fields  ( Ta  and  Tj,  respectively)  for  the  Mediterranean 
Sea  were  obtained  from  collaborators  (Prof.  Nadia  Pinardi)  at  Istituto  Nazionale  di  Geofisica  e 
Vulcanologia  (INGV)  in  Bologna.  These  fields  are  used  with  ensemble  winds  from  the  BHM  due 
to  Milliff  et  al.  (2011)  to  derive  ensemble  surface  stress  realizations  as  outlined  in  the  flowchart 
shown  in  Figure  1 . 

Model  Error  Arising  from  a  Discrete  Grid:  Let  the  parameter  0  denote  the  ocean  state  (velocities 
and  diffusion  coefficients)  in  a  deep  layer  in  the  South  Atlantic,  as  shown  in  Figure  2.  Define  two 
posterior  distributions  as  follows: 

•  p(6\Y)  -which  we  call  model  M.  Here  the  data  Y  are  tracer  concentration  measurements 
which  have  been  averaged  to  the  nearest  site  on  a  regular  spatial  grid  (Fig  2);  and 

•  p(Q\Y)  -  which  we  call  model  M.  Here  the  data  Y  are  tracer  concentration  measurements 
available  at  the  original  spatial  locations  (Fig  2). 

The  Hellinger  distance  between  the  two  posterior  distributions  is  defined  as 

H  =  (vMeifWpW))  de 

Obtaining  an  estimate  of  H  gives  us  a  measure  of  discrepancy  between  the  two  models  M  and  M . 

We  work  under  the  assumption  that  model  M  can  be  explored  efficiently  via  Markov-Chain 
Monte  Carlo  (MCMC)  methods  and  that  model  M  can  be  explored  via  parallel  computing.  We 
derive  a  Monte  Carlo  estimate  of  H  which  requires  the  following  components: 

(i)  A  MCMC  exploration  of  model  M  ;  Assume  0i ,  02 , ...  0b  represents  a  Markov  chain  which 
explores  the  target  p(6\Y). 

(ii)  An  evaluation  of  the  un-normalized  posteriors  p(9i\Y),  i  —  1 .....  0  which  can  be  obtained 
efficiently  via  parallel  computing; 

(iii)  An  estimate  of  p(6*\Y)  and  p(6*\Y)  at  one  single  user  selected  value  0*. 
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Surface  Momentum  Flux  Ensembles  from  Summaries  of  BHM  Winds  (Mediterranean) 

with  Nadia  Pinardi,  Claudia  Fratianni,  Chris  Wikle,  Jeremiah  Brown 
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Figure  1:  Flow  chart  demonstrating  path  from  input  analysis  fields  to  specific  humidity  and 
atmospheric  density  approximations.  These  are  combined  with  estimates  for  drag  coefficient  and  surface 
wind  speed  given  ensemble  winds  from  a  Bayesian  Hierarchical  Model  to  provide  surface  momentum 

flux  ensembles. 
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40W  20W  0 

Figure  2:  Domain  of  interest :  squares  indicate  spatial  locations  where  tracer  concentration 
measurements  are  available;  circles  indicate  a  regular  19  x  37  spatial  grid. 


Time -Varying  Error  Covariance  Models:  Extending  the  time-varying  covariance  methodology 
described  in  Dobricic  et  al.  (2013),  we  consider  modeling  the  simultaneous  evolution  of  the 
background  and  observation  error  covariance  matrices  of  a  spatial  field  for  use  in  data 
assimilation.  In  this  context,  there  are  two  sources  of  data,  one  corresponding  to  historical  model 
misfits  (related  to  the  background  error  covariance)  and  one  associated  with  in  situ  observations, 
such  as  those  from  ARGO  floats  in  the  ocean.  Critically,  both  covariance  matrices  are  conditioned 
on  a  common  reduced-dimensional  covariance  structure  whose  elements  are  allowed  to  evolve  in 
a  Markovian  fashion  through  a  Cholesky  decomposition  formulation.  Although  both  the 
background  and  observation  error  covariance  matrices  are  conditioned  on  this  common 
time-varying  low-rank  matrix,  the  overall  structure  is  somewhat  different  between  them  because 
the  low-rank  matrix  gets  transformed  differently.  This  allows  common  structure  but  allows  one  to 
capture  the  scale  variations  between  the  two  types  of  error  processes. 

Emulator  Assisted  Data  Assimilation:  We  are  interested  in  reduced-rank  parametric  emulators 
for  non-linear  spatio-temporal  error  processes.  We  allow  the  spatial  process  at  time  t  (say,  Yf)  to 
be  decomposed  in  terms  of  two  basis  expansions: 

Y  t  =  <i>Wat  +  <t>Wpt  +  vt, 

where  <J>^,  i  =  1,2  correspond  to  n  x  p  and  n  x  q  matrices  containing  large-scale  and  small-scale 
basis  functions,  respectively.  Then,  at  and  (f  correspond  to  vectors  of  large-scale  and  small-scale 
expansion  coefficients  of  lengths  p  and  q,  respectively,  and  vt  an  error  process  assumed  to  be 
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independent  of  at  and  /3r  Our  primary  interest  is  in  the  evolution  of  the  large-scale  coefficients 
given  by  at,  as  it  is  often  the  case  in  real-world  processes  wherein  the  important  dynamics  exist 
on  a  lower-dimensional  manifold.  Critical  to  our  model  development  is  allowing  the  propagation 
of  a,  to  (Xt+x  (where  t  >  1)  to  be  influenced  by  the  small-scale  coefficients  but  not  allowing  at 
to  influence  /3/+T  directly  in  the  dynamical  formulation.  This  provides  a  physically  realistic  way 
in  which  to  reduce  the  parameter  space  in  the  rank-reduced  general  quadratic  nonlinearity  (GQN) 
formulation  (e.g.,  Wikle  and  Hooten,  2010;  Leeds  et  al.  2013).  Specifically,  we  consider  the 
following  model  for  the  evolution  of  at 

at+x  —  Maaf  +  (Ip ®  & (aty)Magat  +  Mp  Lf3t  +  {\p  ®&(f5t)')M.p  Qpt  +  Tjf+T,  (1) 

for  t  —  1 ....  .  7  and  some  appropriate  time  increment  t,  where  r)t  ~  Gau(0,  Qa),  Ma  corresponds 
to  the  linear  evolution  of  coefficients  for  the  at  process,  M a  Q  corresponds  to  the  nonlinear 
evolution  coefficients  for  the  at  process,  Mg  L  corresponds  to  the  linear  interactions  between  /3r 
and  at+T,  Mg  q  corresponds  to  the  nonlinear  interactions  between  j8?  and  their  impact  on  a/+T, 
and  Qa  is  a  pxp  covariance  matrix.  Note  that  Ma  and  M^  L  arc  p  x  p  and  p  x  q  matrices  while 
M a  Q  and  Mg  g  are  p2  x  p  and  pqx  p  matrices.  Though  we  may  consider  a  variety  of 
transformation  functions  Sf  (•),  for  the  applications  of  interest  in  our  DA  examples,  it  is  reasonable 
to  specify  this  function  be  the  identity  and  hence  define  &(oct)  =  oct  (similarly  for  /3?).  A  major 
challenge  in  the  implementation  of  this  methodology  is  the  development  of  efficient  sampling 
algorithms. 

WORK  COMPLETED 

Converting  Surface  Wind  Realizations  to  a  Surface  Stress  Ensemble:  Codes  have  been  developed 
to  follow  the  progression  toward  a  surface  momentum  flux  (e.g.  surface  stress  vector)  ensemble 
following  Fig  1  using  the  analysis  fields  provided  by  INGV.  Figure  3  depicts  the  intermediate 
output  stages  up  to  the  surface  density  ( pa )  calculation. 

Following  Large  (2006)  we  approximate  the  surface  drag  coefficient  as  noted  in  Fig  1.  The 
coefficients  ap  i  =  1, 2, 3  can  be  treated  as  random  variables  to  be  estimated  in  the  BHM.  This 
enhancement  is  left  for  later  work.  For  now,  we  use  the  ensemble  winds  from  the  BHM  due  to 
Milliff  et  al.  (201 1)  and  convert  each  surface  vector  wind  realization  to  a  surface  momentum 
stress  realization  according  to  the  flowchart  in  Fig  1. 

Model  Error  Arising  from  a  Discrete  Grid:  We  have  implemented  and  tested  this  approach  on  a 
series  of  examples  where  the  Hellinger  distance  can  be  computed  analytically.  We  are  currently 
implementing  this  approach  for  a  synthetic  example  in  which  the  forward  model  is  approximated 
via  a  first  order  emulator  (see  Hooten  et  al.,  201 1).  We  are  also  implementing  this  approach  for 
the  oceanographic  tracer  inversion  problem  described  in  Herbei  and  Berliner,  (2012). 

Time -Varying  Error  Covariance  Models:  The  time- varying  error  covariance  methodology  for 
simultaneous  estimation  of  background  and  observation  error  covariance  matrices  has  been 
implemented  in  several  simulated  examples  and  with  an  application  associated  with  data 
assimilation  in  the  Mediterranean  Sea  (e.g.,  see  Dobricic  et  al.  2013).  This  has  been  written-up  as 
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Figure  3:  Input  analysis  fields  from  ECMWF  via  INGVfor  29  November  2008  at  1200  UTC.  The  top  2 
panels  are  atmospheric  temperature  and  dew  point  temperature  at  2  m,  in  °K.  The  third  panel  is  sea  level 
pressure  in  liPa.  These  fields  are  combined  to  estimate  atmospheric  density  (bottom  panel)  in  kgm~ 3. 
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a  chapter  in  the  dissertation  of  Dan  Gladish  (U.  Missouri)  and  is  in  the  process  of  being  turned 
into  a  paper  for  publication. 

Emulator  Assisted  Data  Assimilation:  The  development  of  the  scale-interaction  GQN  model  is 
outlined  in  a  chapter  in  the  dissertation  of  Dan  Gladish  (U.  Missouri)  and  will  be  submitted  in 
early  October  to  a  special  issue  of  the  journal  Environmetrics  that  is  focused  on 
physical- statistical  modeling  in  the  environmental  sciences. 

Relevant  Meetings  and  Presentations: 

(Wikle)  Invited ;,  Hierarchical  general  quadratic  nonlinear  models  for  spatio-temporal  dynamics. 
Red  Raider  Conference,  Texas  Tech  University,  Lubbock,  TX,  October  2012. 

(Wikle)  Invited;  Efficient  time -frequency  representations  in  high-dimensional  spatial  and 
spatio-temporal  models.  Invited  Talk,  ASA  ENVR  Workshop  on  Environmetrics,  North  Carolina 
State  University,  October  2012. 

(Herbei,  Berliner)  Poster;  Estimating  ocean-circulation:  a  likelihood-free  approach  via  a 
Bernoulli  factory.  Institute  for  Mathematics  and  its  Applications,  Workshop  on  Stochastic 
Modeling  of  the  Ocean  and  Atmosphere;  U.  Minnesota,  March  2013. 

(Milliff)  Invited;  Uncertainty  in  Ensemble  Ocean  Forecasts;  Deducing  Ocean  Model  Error  with 
Ensemble  Winds  from  a  Bayesian  Hierarchical  Model,  Institute  for  Mathematics  and  its 
Applications,  Workshop  on  Stochastic  Modeling  of  the  Ocean  and  Atmosphere;  U.  Minnesota, 
March  2013. 

(Wikle)  Invited;  Using  quadratic  nonlinear  statistical  emulators  to  facilitate  ocean  biogeochemical 
data  assimilation,  Institute  for  Mathematics  and  its  Applications,  Workshop  on  Stochastic 
Modeling  of  the  Ocean  and  Atmosphere;  U.  Minnesota,  March  2013. 

(Wikle)  Invited;  Statistics  and  the  environment:  Overview  and  challenges.  41st  Annual  Meeting 
of  the  Statistical  Society  of  Canada,  Edmonton,  Alberta,  Canada;  May  2013. 

(Wikle)  Invited;  Nonlinear  Dynamic  Spatio-Temporal  Statistical  Models.  Southern  Regional 
Council  on  Statistics  Summer  Research  Conference;  June  2013. 

(Milliff)  Invited;  Bayesian  Hierarchical  Model  Applications  in  Ocean  Forecasting,  Society  for 
Industrial  and  Applied  Mathematics,  Annual  Meeting  Minisymposium  on  Uncertainty 
Quantification  in  Climate  Modeling  and  Prediction;  San  Diego,  CA,  July  2013. 

(Milliff)  Invited;  A  Tale  of  Two  Bayesian  Hierarchical  Models,  IMAGe  2013  Climate  Analytics 
Theme-of-the-Year  Workshop;  Next  Generation  Climate  Data  Products;  July  2013. 

(Herbei)  Exact  MCMC  Using  Approximations,  Joint  Statistical  Meetings,  Montreal,  CANADA; 
August,  2013. 

(Wikle)  Invited  Keynote  Lecture;,  Nonlinear  dynamic  spatio-temporal  statistical  models.  Third 
Workshop  on  Bayesian  Inference  and  Latent  Gaussian  Models  with  Applications,  Reykjavik, 
Iceland,  September  2013. 

RESULTS 

Converting  Surface  Wind  Realizations  to  a  Surface  Stress  Ensemble:  Sample  surface  momentum 
flux  ensembles  are  shown  for  a  snapshot  in  the  western  Mediterranean  Sea  in  Figure  4.  As  noted 
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Figure  4:  Sample  surface  momentum  flux  vectors  at  1200  UTC  on  29  Nov  2008.  Ensemble  f  are 
obtained  as  summaries  of  ensemble  winds  from  a  BHM  (Milliff  et  al.,  2011),  given  input  surface  and  2m 

analysis  fields. 


in  the  progress  report  from  last  year,  ensemble  surface  wind  stress  estimates  from  the  BHM  can 
be  used  to  diagnose  loci  of  potential  model  error  in  4dvar  ocean  data  assimilation  systems.  If  the 
iterations  in  the  strong-constraint  4dvar  between  forward  and  adjoint  models  shifts  the  surface 
flux  control  vector  outside  the  probabilistic  ensemble  estimated  in  the  BHM,  there  is  the 
possibility  that  the  control  vector  variable  is  being  used  to  correct  for  model  error  rather  than 
forcing  error.  This  was  demonstrated  for  the  California  Current  System  ROMS  4dvar  forecast 
system  due  to  Prof.  Andrew  M.  Moore  and  colleagues. 

Our  next  focus  will  be  on  similar  BHM  ensemble  developments  for  the  other  variables  comprising 
the  forcing  part  of  the  control  vector  in  ROMS  4dvar;  e.g.  surface  heat  and  fresh-water  fluxes. 

Model  Error  Arising  from  a  Discrete  Grid:  Our  initial  tests  show  that  the  estimate  we  propose 
worked  well  in  all  cases.  We  are  currently  running  simulations  for  the  oceanographic  tracer 
concentration  inversion. 

Time-Varying  Error  Covariance  Models:  We  have  preliminary  results  for  an  application  in  the 
Mediterranean  Sea,  which  uses  the  data  described  in  Dobricic  et  al.  (2013).  This  is  currently  in  a 
draft  chapter  of  Dan  Gladish’s  dissertation  (U.  Missouri),  the  final  version  of  which  will  be 
submitted  in  November,  2013.  At  that  time,  we  expect  to  finish  converting  the  chapter  to  a  paper 
for  submission. 

Emulator  Assisted  Data  Assimilation:  We  have  preliminary  results  for  an  application  to 
long-lead  sea  surface  temperature  prediction  in  the  tropical  Pacific  ocean,  as  well  as  6-24  hour 
forecasts  of  SLP  over  the  midwest  USA.  These  currently  reside  in  a  draft  chapter  of  Dan 
Gladish’s  dissertation  (U.  Missouri),  the  final  version  of  which  will  be  submitted  in  November, 
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2013.  We  are  in  the  process  of  converting  this  chapter  to  paper  for  submission  to  a  special  issue  of 
Environmetrics  in  early  October  2013. 


IMPACT/APPLICATIONS 

Our  research  thus  far,  demonstrates  the  wide  scope  of  applicability  of  the  BHM  methodology  in 
characterizing,  identifiing  and  modelling  irreducible  model  error  in  ocean  forecast  systems.  Our 
work  is  leading  to  operationally  useful  estimations  of  the  space-time  properties  of  uncertainties  in 
these  systems. 

TRANSITIONS 

Presentations  and  discussions  at  the  Bayesian  Confab  meeting  in  Boulder,  31  July  -  2  August 
2013,  focused  on  Irreducible  Model  Error  issues. 

RELATED  PROJECTS 

“Estimating  Ecosystem  Model  Uncertainties  in  Pan-Regional  Syntheses  and  Climate  Change 
Impacts  on  Coastal  Domains  of  the  North  Pacific  Ocean”,  NSF  US  Globec  Program,  October 
2009  -  September  2012. 

“Quantifying  the  Amplitude,  Structure  and  Influence  of  Model  Error  during  Ocean  Analysis  and 
Forecast  Cycles”,  ONR  Physical  Oceanography  Program,  A.  Moore  (PI). 

“Ocean  Surface  Vector  Winds  in  Multi-Platform  Bayesian  Hierarchical  Model  Applications”, 
International  Ocean  Vector  Winds  Science  Team,  NASA  Physical  Oceanography  Program,  R. 
Milliff  (PI). 

’’Bayesian  Hierarchical  Climate  Prediction”,  NSF,  April  2011  -  March  2014,  C.K.  Wikle  and 
L.M.  Berliner  (Pis) 
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